plots as given in Figures 19 and 20 can be used to identify the best fragments of 
threading models. Indeed, there are very strong correlations between the lowest 
mobility and the best structural fidelity (to the target structure) of the model chain 
fragments. This may have some other applications, where assessment of the 
reliability of various parts of a model structure is needed. 
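
The fragment selection described above can be sketched as follows. This is only an illustrative sketch: it assumes that "mobility" is measured as the root-mean-square fluctuation of each residue about its mean position over superimposed simulation snapshots, which the text does not specify. The function names, the cutoff, and the minimum fragment length are hypothetical.

```python
import math

def per_residue_mobility(snapshots):
    """Root-mean-square fluctuation of each residue about its mean position.

    snapshots: list of frames; each frame is a list of (x, y, z) tuples,
    one per residue. Frames are ASSUMED to be superimposed already.
    """
    n_frames = len(snapshots)
    n_res = len(snapshots[0])
    mobility = []
    for i in range(n_res):
        # Mean position of residue i over all frames.
        mean = [sum(frame[i][k] for frame in snapshots) / n_frames
                for k in range(3)]
        # Mean squared deviation from that mean position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in snapshots
        ) / n_frames
        mobility.append(math.sqrt(msf))
    return mobility

def low_mobility_fragments(mobility, cutoff, min_len=3):
    """Return inclusive (start, end) index pairs of runs of at least
    min_len consecutive residues whose mobility is <= cutoff.
    Both cutoff and min_len are hypothetical tuning parameters."""
    fragments, start = [], None
    for i, m in enumerate(mobility):
        if m <= cutoff:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                fragments.append((start, i - 1))
            start = None
    if start is not None and len(mobility) - start >= min_len:
        fragments.append((start, len(mobility) - 1))
    return fragments
```

Under this assumption, the fragments returned would be the candidates for the most reliable parts of the model.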

Summary and Conclusion 

In this example, the invention again was shown to be useful in predicting 
medium- to low-resolution protein structures based on homology or sequence- 
structure compatibility. Here, the initial alignment between the target and template 
was generated by a threading procedure. Of course, alignments also can be obtained 
by other means, e.g., from sequence alignments. Such templates are used to guide 
Monte Carlo simulations that employ a reduced protein chain representation built 
using pseudoatoms to represent the side chain center of mass of the various amino 
acid residues of a protein or protein domain. In contrast to the method of Example 1, 
the pseudoatoms of the SICHO model used here also took account of alpha-carbon 
atoms, in addition to the corresponding side chains. This alternate 
embodiment of the model proved capable of making large structural rearrangements 
that, in about a third of the studied cases, led to qualitative improvements in the initially 
poor models. In some other cases, despite a huge decrease in the RMSD between 
the model and the target native structure, the final model was still not satisfactory. 
The analysis of the simulation trajectories allows for the plausible identification of 
those cases where the final model improves qualitatively with respect to the initial, 
threading-based model. 
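
The RMSD comparison between the initial and final models can be sketched as follows. This is a minimal illustration, not the procedure used in the example: it assumes each model has already been optimally superimposed on the native structure (the superposition step itself is omitted), and the function names are hypothetical.

```python
import math

def rmsd(model, native):
    """Coordinate RMSD between two equal-length lists of (x, y, z) tuples.

    ASSUMPTION: the two structures are already superimposed; no fitting
    is performed here.
    """
    if len(model) != len(native):
        raise ValueError("model and native must have the same number of atoms")
    sq = sum((a - b) ** 2
             for p, q in zip(model, native)
             for a, b in zip(p, q))
    return math.sqrt(sq / len(model))

def rmsd_improvement(trajectory, native):
    """Return (initial RMSD minus final RMSD, full RMSD trace) for a list
    of frames, where the first frame is the initial model and the last
    frame is the final model."""
    trace = [rmsd(frame, native) for frame in trajectory]
    return trace[0] - trace[-1], trace
```

A positive first return value indicates that the final model is closer to the native structure than the initial model. As noted above, a large decrease alone does not guarantee a satisfactory final model.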

The present invention is useful for large-scale protein structure and function 
prediction. Using the invention, it is possible to identify the biochemical function of 
a protein having a model with a 5-6 Å backbone RMSD (refs. 7, 8). Certainly, it 
would be much more difficult, if not impossible, to make such an identification for a 
model with an 8 Å Cα RMSD from the native polypeptide. For example, the model of 
plastocyanin (2pcy) generated above had its four copper-binding residues much 
closer to their native positions than predicted by the threading-based model. Thus, 
having a structural template of this active site (e.g., an FSD), the model structure can 
be identified with high fidelity as a copper-binding protein. The results above show 
that for many new or known proteins (e.g., those identified in the course of 
high-throughput nucleic acid sequencing programs), the invention can be used to identify 
their function(s). The invention also complements sequence-based and threading 
methods, and provides a basis for improving initially poor and incomplete models. 
The invention is also complementary to standard homology modeling 
tools, enabling homology modeling in those cases where the template is structurally 
very far from the target structure. 
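
Matching a model against a structural template of an active site can be sketched as a simple distance check. This is an illustrative assumption about how such a template might be encoded, not the FSD definition itself: the template is represented here as expected distances between pairs of labeled residues, and the labels, distances, and tolerance are all hypothetical.

```python
import math

def matches_site_template(coords, template, tolerance):
    """Check whether labeled residue positions satisfy a distance template.

    coords:    dict mapping a residue label to its (x, y, z) position
               in the model (e.g., alpha-carbon coordinates).
    template:  dict mapping (label_a, label_b) to the expected distance
               between the two residues (hypothetical encoding).
    tolerance: maximum allowed absolute deviation from each expected
               distance, in the same units as the coordinates.
    """
    for (a, b), expected in template.items():
        observed = math.dist(coords[a], coords[b])
        if abs(observed - expected) > tolerance:
            return False
    return True
```

A looser tolerance would be appropriate for lower-resolution models, at the cost of more false matches.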
